---
title: Model
keywords: fastai
sidebar: home_sidebar
summary: "The goal of this challenge is to find all instances of dolphins in a picture and then color the pixels of each dolphin with a unique color."
description: "The goal of this challenge is to find all instances of dolphins in a picture and then color the pixels of each dolphin with a unique color."
nb_path: "notebooks/02_Model.ipynb"
---
{% raw %}
{% endraw %} {% raw %}
{% endraw %}

Model

Here is an example of how to create a model for instance segmentation:

{% raw %}
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

def get_instance_segmentation_model(hidden_layer_size, box_score_thresh=0.5):
    # our dataset has two classes only - background and dolphin
    num_classes = 2

    # load an instance segmentation model pre-trained on COCO
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(
        pretrained=True,
        box_score_thresh=box_score_thresh,
    )

    # get the number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # now get the number of input features for the mask classifier
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels

    model.roi_heads.mask_predictor = MaskRCNNPredictor(
        in_channels=in_features_mask, 
        dim_reduced=hidden_layer_size,
        num_classes=num_classes
    )

    return model
{% endraw %} {% raw %}
# get the model using our helper function
model = get_instance_segmentation_model(hidden_layer_size=256)

# move model to the right device
model.to(device)

# construct an optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

# and a learning rate scheduler which decreases the learning rate by
# 10x every 10 epochs
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
{% endraw %} {% raw %}

train_one_epoch[source]

train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)

Trains one epoch of the model. Copied from the reference implementation from https://github.com/pytorch/vision.git.

{% endraw %} {% raw %}
{% endraw %}

For training the model, you can use train_one_epoch in a training loop as follows:

{% raw %}
if Path("saved_models").exists():
    saved_model_path = Path("./saved_models/model.pt")
else:
    saved_model_path = Path("./notebooks/saved_models/model.pt")
    
if saved_model_path.exists():
    num_epochs=1
else:
    num_epochs = 20

data_loader, data_loader_test = get_dataset("segmentation", batch_size=4, get_tensor_transforms=get_my_tensor_transforms)
{% endraw %} {% raw %}
for epoch in range(num_epochs):
    # train for one epoch, printing every 20 iterations
    train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=20)
    
    # update the learning rate
    lr_scheduler.step()
{% endraw %} {% raw %}

show_prediction[source]

show_prediction(model, img:tensor([]), score_threshold:float=0.5, width:int=820)

Show a single prediction by the model

{% endraw %} {% raw %}
{% endraw %}

Show predictions on a single input image:

{% raw %}
# pick one image from the test set
img, _ = data_loader_test.dataset[0]
    
show_prediction(model, img)
{% endraw %}

We can also show predictions for the whole dataset, or for a subset of it, given a dataloader object:

{% raw %}

show_predictions[source]

show_predictions(model, data_loader=None, dataset=None, n=None, score_threshold=0.5, iou_df=None, width=820)

Show at most n predictions for examples in a given data loader.

{% endraw %} {% raw %}
{% endraw %}

Show predictions for the first two elements in the data loader:

{% raw %}
show_predictions(model, data_loader=data_loader_test, n=2, score_threshold=0.5)
{% endraw %}

Metrics

{% raw %}
_, masks = get_true_and_predicted_masks(model, data_loader_test.dataset[0], 0.5)
img, _ = data_loader_test.dataset[0]

print(f'We have {masks["true"].shape[0]} dolphins on the photo, total of {masks["predicted"].shape[0]} are predicted with score higher than 0.5')

assert len(masks["true"].shape) == 3
assert len(masks["predicted"].shape) == 3

show_prediction(model, img)
We have 3 dolphins on the photo, total of 4 are predicted with score higher than 0.5
{% endraw %}

Metric explanation

For evaluating instance segmentation results, we use a metric called Intersection over Union, or IoU. The IoU metric quantifies the percent overlap between the ground-truth (target) mask and the predicted (output) mask: it is the number of pixels common to both masks divided by the total number of pixels present across both masks, and is mathematically represented as:

{% raw %} $$IoU = \frac{|target \cap prediction|}{|target \cup prediction|}$$ {% endraw %}

As a visual example, let's suppose we're tasked with calculating the IoU score of a prediction mask (colored yellow), given the ground-truth labeled mask (colored blue). The intersection (A∩B) consists of the pixels found in both the prediction mask and the ground-truth mask (denoted in green), whereas the union (A∪B) consists of all pixels found in either the prediction or the target mask.

![IoU example](images/Iou_example1.jpg) Image credits: link

As can be seen from the above example, the greater the intersection or overlap between the ground truth and the predicted mask, the greater the IoU metric value. The maximum value of 1 is obtained when the predicted mask and the ground-truth mask overlap perfectly, and the minimum value of 0 is obtained when there is no overlap at all.
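As a minimal sketch of this pairwise computation (a plain NumPy illustration, not the helper function documented below), the IoU of two binary masks can be computed like this:

```python
import numpy as np

def iou_pair(pred_mask, true_mask):
    """IoU between two binary masks of the same shape."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    # an empty union means both masks are empty; define IoU as 0
    return intersection / union if union > 0 else 0.0

# two 4x4 masks of 4 pixels each, overlapping in 2 pixels:
# intersection = 2, union = 6, so IoU = 2/6
a = np.zeros((4, 4)); a[0:2, 0:2] = 1
b = np.zeros((4, 4)); b[1:3, 0:2] = 1
print(iou_pair(a, b))
```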

{% raw %}

iou_metric_mask_pair[source]

iou_metric_mask_pair(binary_segmentation:array, binary_gt_label:array)

Compute the IOU between two binary segmentations (typically one ground truth and one prediction).

Input:
  binary_segmentation: binary 2D numpy array representing the region of interest as segmented by the algorithm
  binary_gt_label: binary 2D numpy array representing the region of interest as provided in the database

Output:
  IOU: IOU between the segmentation and the ground truth

{% endraw %} {% raw %}
{% endraw %}

The above explanation covers a pair of one ground-truth (true) mask and one predicted mask. The IOU metric for such a pair can be calculated as follows:

{% raw %}
img, masks = get_true_and_predicted_masks(model, data_loader_test.dataset[0])

# calculate the metrics
iou_metric_mask_pair(
    binary_segmentation=masks["predicted"][0, :, :],
    binary_gt_label=masks["true"][0, :, :],
)
0.009998272132496626
{% endraw %}

In the instance segmentation task, a single image may yield multiple predicted instance segmentation masks, and the predicted masks are not necessarily in the same order as the ground-truth masks; the ordering of true and predicted masks can be, and usually is, different. In the example below, we have three true (ground-truth) masks but four predicted masks with score larger than 0.5:
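The full pairwise comparison can be sketched as a matrix with one row per predicted mask and one column per true mask (an illustrative helper, not this document's iou_metric_matrix_of_example):

```python
import numpy as np

def iou_matrix(pred_masks, true_masks):
    """Pairwise IoU: rows are predicted masks, columns are true masks."""
    m = np.zeros((len(pred_masks), len(true_masks)))
    for i, pred in enumerate(pred_masks):
        for j, true in enumerate(true_masks):
            intersection = np.logical_and(pred, true).sum()
            union = np.logical_or(pred, true).sum()
            m[i, j] = intersection / union if union > 0 else 0.0
    return m
```

Entry (i, j) is the IoU of prediction i against ground truth j, so a well-matched pair shows up as a large value somewhere in row i, regardless of how the two mask lists are ordered.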

{% raw %}

iou_metric_matrix_of_example[source]

iou_metric_matrix_of_example(model:MaskRCNN, example:Tuple[Tensor, Dict[str, Tensor]], score_threshold:float=0.5)

{% endraw %} {% raw %}
{% endraw %} {% raw %}
metrics = iou_metric_matrix_of_example(model, data_loader_test.dataset[0], 0.5)

cm = sns.light_palette("lightblue", as_cmap=True)

df = pd.DataFrame(metrics)
df.style.background_gradient(cmap=cm)
|   | 0        | 1        | 2        |
|---|----------|----------|----------|
| 0 | 0.011288 | 0.000000 | 0.691147 |
| 1 | 0.586046 | 0.172828 | 0.009076 |
| 2 | 0.041091 | 0.514863 | 0.000000 |
| 3 | 0.302293 | 0.244689 | 0.000000 |
{% endraw %}

For a single input image, which may contain multiple predicted masks and multiple ground-truth masks (since there can be more than one dolphin in the image), we first calculate the IOU metric for all predicted and ground-truth pairs.

In the example above, we have three dolphins with three true masks, while the model predicted four masks. This is why the matrix above has four rows (corresponding to predictions) and three columns (corresponding to ground truth). The first mask predicted by the model is represented by the first row (row 0); as we can see, its best fit is with the third true mask (column 2). The second predicted mask is represented by the second row (row 1), and its best fit is with the first true mask (column 0), and so on. The last row is an extra prediction.

Thus, for a single input image, we match predictions to ground-truth masks so that the total IOU score for the image is maximized. In the example above, the IOU metric for the first predicted mask is 0.691147, and 0.586046, 0.514863, and 0.000 for the second, third, and fourth respectively; the mean of these four values gives the IOU metric for the single example image. The fourth prediction is an extra, incorrect one and is therefore assigned the value 0.000.
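One way to realize this maximizing assignment is a greedy sweep over the IoU matrix: repeatedly take the largest remaining entry, then discard its row and column; any prediction left without a ground-truth match scores 0. A sketch of that idea (illustrative only, not this document's largest_values_in_row_colums implementation):

```python
import numpy as np

def greedy_row_col_max(matrix):
    """Greedily match rows (predictions) to columns (true masks) by IoU."""
    m = np.array(matrix, dtype=float)
    scores = []
    for _ in range(m.shape[0]):          # one score per predicted mask
        i, j = np.unravel_index(np.argmax(m), m.shape)
        best = m[i, j]
        # once all columns are used up, remaining rows are extra predictions
        scores.append(float(best) if best > 0 else 0.0)
        m[i, :] = -1.0                   # remove the matched prediction
        m[:, j] = -1.0                   # remove the matched ground truth
    return scores

# the example IoU matrix from above: 4 predictions x 3 true masks
metrics = [[0.011288, 0.000000, 0.691147],
           [0.586046, 0.172828, 0.009076],
           [0.041091, 0.514863, 0.000000],
           [0.302293, 0.244689, 0.000000]]
print(greedy_row_col_max(metrics))
# [0.691147, 0.586046, 0.514863, 0.0]
```

Note that greedy matching only approximates the optimal assignment; an exact maximization would solve the linear assignment problem (e.g. the Hungarian algorithm), but on matrices this small the greedy result usually coincides with it.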

We repeat the above for all the images in the dataset and take the mean of the IOU values to obtain the IOU metric value for the entire dataset.

{% raw %}

largest_values_in_row_colums[source]

largest_values_in_row_colums(xs:array)

Approximates the largest value in each row/column.

{% endraw %} {% raw %}
{% endraw %} {% raw %}
largest_values_in_row_colums(metrics)
[0.6911470837340903, 0.5860459463652686, 0.5148628632289045, 0.0]
{% endraw %} {% raw %}

iou_metric_example[source]

iou_metric_example(model:MaskRCNN, example:Tuple[Tensor, Dict[str, Tensor]], score_threshold:float=0.5)

{% endraw %} {% raw %}
{% endraw %}

Finally, we can get IOU metrics for the whole image:

{% raw %}
metric = iou_metric_example(model, data_loader_test.dataset[4], 0.5)

print(f"Average IOU metric on given example is {metric:.3f}")
Average IOU metric on given example is 0.393
{% endraw %} {% raw %}

iou_metric[source]

iou_metric(model:MaskRCNN, dataset:Dataset, score_threshold:float=0.5)

Calculate the IOU metric on the whole dataset.

{% endraw %} {% raw %}
{% endraw %} {% raw %}
%%time

iou, iou_df = iou_metric(model, data_loader_test.dataset)

iou_df.sort_values(by="iou").style.background_gradient(cmap=cm)
CPU times: user 10.8 s, sys: 26.3 ms, total: 10.9 s
Wall time: 7.09 s
paths iou
15 data/dolphins_200_train_val/Val/JPEGImages/140724_16_2_0385.jpg 0.223536
7 data/dolphins_200_train_val/Val/JPEGImages/140701_6_1_0025.jpg 0.226512
22 data/dolphins_200_train_val/Val/JPEGImages/140830_47_1_0471.jpg 0.243580
36 data/dolphins_200_train_val/Val/JPEGImages/190706_17_1_0215.jpg 0.247639
34 data/dolphins_200_train_val/Val/JPEGImages/190627_12_2_0125.jpg 0.250209
10 data/dolphins_200_train_val/Val/JPEGImages/140720_15_1_0424.jpg 0.299945
16 data/dolphins_200_train_val/Val/JPEGImages/140728_20_1_0698.jpg 0.312832
8 data/dolphins_200_train_val/Val/JPEGImages/140704_9_1_0058.jpg 0.312906
32 data/dolphins_200_train_val/Val/JPEGImages/170829_34_1_0103.jpg 0.330877
6 data/dolphins_200_train_val/Val/JPEGImages/140701_5_1_0043.jpg 0.333474
29 data/dolphins_200_train_val/Val/JPEGImages/170723_19_1_0055.jpg 0.352385
5 data/dolphins_200_train_val/Val/JPEGImages/140426_4_1_0117.jpg 0.360652
27 data/dolphins_200_train_val/Val/JPEGImages/170612_1_1_0110.jpg 0.363840
33 data/dolphins_200_train_val/Val/JPEGImages/190611_4_1_0489.jpg 0.372803
9 data/dolphins_200_train_val/Val/JPEGImages/140717_12_1_0407.jpg 0.381662
11 data/dolphins_200_train_val/Val/JPEGImages/140720_15_1_0463.jpg 0.385915
38 data/dolphins_200_train_val/Val/JPEGImages/190819_43_1_0234.jpg 0.399510
37 data/dolphins_200_train_val/Val/JPEGImages/190819_43_1_0108.jpg 0.406923
4 data/dolphins_200_train_val/Val/JPEGImages/140426_3_1_0130.jpg 0.415232
35 data/dolphins_200_train_val/Val/JPEGImages/190701_14_1_0067.jpg 0.415491
21 data/dolphins_200_train_val/Val/JPEGImages/140810_38_1_0263.jpg 0.418434
25 data/dolphins_200_train_val/Val/JPEGImages/150724_78_1_0667.jpg 0.433198
17 data/dolphins_200_train_val/Val/JPEGImages/140810_31_1_0054.jpg 0.440690
26 data/dolphins_200_train_val/Val/JPEGImages/150728_83_1_1180.jpg 0.440916
0 data/dolphins_200_train_val/Val/JPEGImages/070729_11_2_0026.jpg 0.453103
3 data/dolphins_200_train_val/Val/JPEGImages/070828_20_1_0136.jpg 0.514249
31 data/dolphins_200_train_val/Val/JPEGImages/170808_27_1_0287.jpg 0.523139
19 data/dolphins_200_train_val/Val/JPEGImages/140810_33_1_0254.jpg 0.524899
13 data/dolphins_200_train_val/Val/JPEGImages/140724_16_1_0003.jpg 0.559169
2 data/dolphins_200_train_val/Val/JPEGImages/070828_20_1_0060.jpg 0.580168
28 data/dolphins_200_train_val/Val/JPEGImages/170612_1_1_0424.jpg 0.604693
30 data/dolphins_200_train_val/Val/JPEGImages/170723_19_1_0094.jpg 0.621366
20 data/dolphins_200_train_val/Val/JPEGImages/140810_35_3_0074.jpg 0.626951
1 data/dolphins_200_train_val/Val/JPEGImages/070730_13_2_0100.jpg 0.632828
18 data/dolphins_200_train_val/Val/JPEGImages/140810_31_1_0124.jpg 0.642030
12 data/dolphins_200_train_val/Val/JPEGImages/140720_15_1_1314.jpg 0.649436
23 data/dolphins_200_train_val/Val/JPEGImages/150722_77_1_0106.jpg 0.659817
24 data/dolphins_200_train_val/Val/JPEGImages/150724_78_1_0513.jpg 0.684226
14 data/dolphins_200_train_val/Val/JPEGImages/140724_16_2_0244.jpg 0.764574
{% endraw %} {% raw %}

show_predictions_sorted_by_iou[source]

show_predictions_sorted_by_iou(model, dataset)

{% endraw %} {% raw %}
{% endraw %} {% raw %}
show_predictions_sorted_by_iou(model, data_loader_test.dataset)
IOU metric: 0.22353647269576193
IOU metric: 0.22651195288554185
IOU metric: 0.24357953740675928
IOU metric: 0.24763923379844552
IOU metric: 0.25020883543624445
IOU metric: 0.29994482041834286
IOU metric: 0.3128317574733103
IOU metric: 0.3129064192033861
IOU metric: 0.3308765437705173
IOU metric: 0.333474038271035
IOU metric: 0.3523853480220891
IOU metric: 0.36065170737178515
IOU metric: 0.36383963081129733
IOU metric: 0.3728026692376258
IOU metric: 0.3816624869115941
IOU metric: 0.38591507183568075
IOU metric: 0.39950977074947613
IOU metric: 0.4069229482154203
IOU metric: 0.41523164745454477
IOU metric: 0.4154907134825215
IOU metric: 0.41843354007748
IOU metric: 0.4331981674477092
IOU metric: 0.44069019769563633
IOU metric: 0.44091640682702726
IOU metric: 0.45310273303189386
IOU metric: 0.514248641788219
IOU metric: 0.5231385433603896
IOU metric: 0.5248993350299268
IOU metric: 0.5591693115378393
IOU metric: 0.5801680551818325
IOU metric: 0.6046928164434141
IOU metric: 0.6213657776310128
IOU metric: 0.6269513435677614
IOU metric: 0.6328281311301569
IOU metric: 0.6420295968235277
IOU metric: 0.6494357313661979
IOU metric: 0.6598167578096537
IOU metric: 0.6842260782056337
IOU metric: 0.7645738864086551
{% endraw %}